Sleep debt and body anxiety have long been major topics of concern, and BMI (Body Mass Index) is an important international standard used to measure the degree of obesity and health.
The surveybmi data set comes from a group survey conducted on Wenjuanxing (https://www.wjx.cn/vj/hQSvcig.aspx), one of the most authoritative questionnaire platforms in China.
The questionnaire was designed with 13 variables: 8 quantitative and 5 qualitative. The survey was released on April 18 and ran for 2 days, with 155 participants.
Limitations:
# Read data
survey = read.csv("surveybmi.csv")
# quick view of the first 6 rows
head(survey)
## X1.Your.age X2..Your.gender X3..Your.job X4..Your.height X5..Your.weight
## 1 19 female undergraduate 170 48
## 2 18 female undergraduate 164 55
## 3 18 female undergraduate 167 60
## 4 19 male undergraduate 174 76
## 5 18 female undergraduate 160 55
## 6 18 female undergraduate 169 50
## X7..What.time.do.you.sleep X8..How.long.do.you.sleep.every.day
## 1 after 1 am 6.0
## 2 12 am to 1 am 8.0
## 3 after 1 am 7.5
## 4 11 pm to 12 am 6.0
## 5 12 am to 1 am 8.0
## 6 11 pm to 12 am 8.0
## X9..How.long.do.you.work.or.study.per.day
## 1 10
## 2 4
## 3 12
## 4 12
## 5 13
## 6 10
## X10..How.long.do.you.exercise.per.day X13..How.is.your.sleeping.quality
## 1 2.0 often dream and not so good
## 2 0.0 occasionally dream and good
## 3 0.5 occasionally dream and good
## 4 2.0 occasionally dream and good
## 5 0.0 occasionally dream and good
## 6 2.0 often dream and not so good
## X14..Your.sleep.quality BMI How.long.you.sleep.per.day
## 1 3 16.61 less than 7
## 2 2 20.45 more than 7
## 3 2 21.51 more than 7
## 4 2 25.10 less than 7
## 5 2 21.48 more than 7
## 6 3 17.51 more than 7
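The BMI column printed above is consistent with the standard definition: weight in kilograms divided by the square of height in metres. A minimal sketch of that calculation follows, assuming height is recorded in centimetres and weight in kilograms, as the first rows suggest. The helper name `compute_bmi` is hypothetical and not part of the original analysis.

```r
# Hypothetical helper: BMI = weight (kg) / height (m)^2
compute_bmi <- function(weight_kg, height_cm) {
  height_m <- height_cm / 100   # convert centimetres to metres
  weight_kg / height_m^2
}

# Weight and height of the first three rows printed above
bmi_check <- round(compute_bmi(c(48, 55, 60), c(170, 164, 167)), 2)
print(bmi_check)
```

The rounded results can be compared with the BMI values shown for rows 1 to 3.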
# Create a vector of short column names
myname <- c(
"age",
"gender",
"identity",
"height",
"weight",
"resttime",
"sleephour",
"workhour",
"sporthour",
"sleepquality",
"slpqual",
"bmi",
"slphour"
)
# Assign the new names to the columns of survey
names(survey) <- myname
# Quick view of the data
str(survey)
## 'data.frame': 155 obs. of 13 variables:
## $ age : int 19 18 18 19 18 18 23 23 19 18 ...
## $ gender : chr "female" "female" "female" "male" ...
## $ identity : chr "undergraduate" "undergraduate" "undergraduate" "undergraduate" ...
## $ height : num 170 164 167 174 160 169 169 180 170 173 ...
## $ weight : num 48 55 60 76 55 50 55 70 60 52 ...
## $ resttime : chr "after 1 am" "12 am to 1 am" "after 1 am" "11 pm to 12 am" ...
## $ sleephour : num 6 8 7.5 6 8 8 8 6 12 8 ...
## $ workhour : num 10 4 12 12 13 10 8 8 4 5 ...
## $ sporthour : num 2 0 0.5 2 0 2 0 4 1 2 ...
## $ sleepquality: chr "often dream and not so good" "occasionally dream and good" "occasionally dream and good" "occasionally dream and good" ...
## $ slpqual : int 3 2 2 2 2 3 4 1 2 3 ...
## $ bmi : num 16.6 20.4 21.5 25.1 21.5 ...
## $ slphour : chr "less than 7" "more than 7" "more than 7" "less than 7" ...
# show the minimum, maximum, mean and median of sleeping hour
summary(survey$sleephour)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.000 6.500 7.000 7.294 8.000 12.000
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
p = plot_ly(data = survey, x = ~sleephour, type = 'box')
p
# Show the maximum, minimum, mean and median of bmi
summary(survey$bmi)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 13.56 19.70 21.26 22.18 23.89 36.89
# Call the ggplot library
library(ggplot2)
# Create a histogram to show the distribution of BMI
p = ggplot(data = survey, aes(x = bmi))
p + geom_histogram(aes(y = after_stat(density)), binwidth = 0.5, fill = "lightblue", alpha = 0.3) + geom_density(alpha = .2, fill = "red") + xlab('BMI')
p = ggplot(data = survey, aes(x = resttime, fill = sleepquality))
p + geom_bar() + theme(panel.background = element_rect(fill = 'transparent', color = "gray"), axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5, color = "black", size = 9))
# Calculate the mean BMI of people who sleep less than the median (7 hours) and of people who sleep more than the median
mean(survey$bmi[survey$sleephour < 7])
## [1] 23.01952
mean(survey$bmi[survey$sleephour > 7])
## [1] 21.56296
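The two means above leave out respondents who sleep exactly 7 hours, because the filters are strict (`< 7` and `> 7`). As a sketch of how all three groups could be summarised in one call, the following uses `tapply()` on a small hypothetical data frame (`toy`, not the survey responses); the group labels are also illustrative.

```r
# Hypothetical toy data, not the survey responses
toy <- data.frame(
  sleephour = c(5.5, 6, 6.5, 7, 7.5, 8, 9),
  bmi       = c(24, 23, 22, 21, 21, 20, 19)
)

# Label each row relative to the median sleep duration
med <- median(toy$sleephour)
toy$group <- ifelse(toy$sleephour < med, "below median",
              ifelse(toy$sleephour > med, "above median", "at median"))

# Mean BMI per group
group_means <- tapply(toy$bmi, toy$group, mean)
print(group_means)
```

Keeping the "at median" group visible makes it clear how many observations a strict comparison drops.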
library(plotly)
p = plot_ly(data = survey, x = ~bmi, color = ~slphour, type = 'box')
p
# Calculate the IQR of BMI
iqrbmi = IQR(survey$bmi)
# IQR of BMI = 4.18, Q1 of BMI = 19.7, Q3 of BMI = 23.89
# Upper threshold (UT) of BMI = Q3 + 1.5 * IQR = 30.16
UT = 23.89 + 1.5*iqrbmi
# Lower threshold (LT) of BMI = Q1 - 1.5 * IQR = 13.43
LT = 19.7 - 1.5*iqrbmi
# Keep only the observations with BMI strictly between LT and UT
surveymain = subset(survey, bmi > LT & bmi < UT)
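The thresholds above are typed in from the rounded quartiles. A sketch of the same 1.5 * IQR rule computed directly with `quantile()` is shown below on a small hypothetical vector rather than the survey data; `iqr_fences` and `toy_bmi` are illustrative names, not part of the original analysis.

```r
# Hypothetical helper: lower and upper fences of the 1.5 * IQR rule
iqr_fences <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  spread <- q[2] - q[1]                      # interquartile range
  c(lower = q[1] - 1.5 * spread, upper = q[2] + 1.5 * spread)
}

# Hypothetical values with one obvious outlier
toy_bmi <- c(18, 19, 20, 21, 22, 23, 24, 40)
fences <- iqr_fences(toy_bmi)

# Keep values strictly inside the fences
kept <- toy_bmi[toy_bmi > fences["lower"] & toy_bmi < fences["upper"]]
print(fences)
print(kept)
```

Computing the quartiles in code avoids rounding the thresholds by hand, although unrounded quartiles may move the cut-offs slightly.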
# Calculate the value of correlation between sleep duration and BMI
cor(surveymain$sleephour, surveymain$bmi)
## [1] -0.2742257
# Plot the scatter plot of sleephour and BMI
c=ggplot(surveymain,aes(x=sleephour,y=bmi))
c+geom_point() + geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
# Construct a scatter plot of weight and BMI
c=ggplot(survey,aes(x=weight,y=bmi))
c+geom_point()
# Calculate the correlation coefficient
cor(survey$weight,survey$bmi)
## [1] 0.8614837
# Fit a linear regression model of weight on BMI
L=lm(survey$weight~survey$bmi)
# Show the coefficients, then the full model summary
L$coeff
## (Intercept) survey$bmi
## 2.846189 2.747777
summary(L)
##
## Call:
## lm(formula = survey$weight ~ survey$bmi)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.632 -4.338 -1.264 4.362 16.263
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.8462 2.9490 0.965 0.336
## survey$bmi 2.7478 0.1309 20.985 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.335 on 153 degrees of freedom
## Multiple R-squared: 0.7422, Adjusted R-squared: 0.7405
## F-statistic: 440.4 on 1 and 153 DF, p-value: < 2.2e-16
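In a simple linear regression with one predictor, the multiple R-squared equals the square of the Pearson correlation between the two variables. The sketch below checks this against the values reported above and then against a small hypothetical data set; `x` and `y` are made-up numbers used only for illustration.

```r
# Correlation reported above for weight and BMI
r_reported <- 0.8614837
r2_from_r <- round(r_reported^2, 4)
print(r2_from_r)

# Hypothetical toy data to show the identity on a fitted model
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
fit <- lm(y ~ x)
same <- isTRUE(all.equal(summary(fit)$r.squared, cor(x, y)^2))
print(same)
```

The squared correlation can be compared with the Multiple R-squared line in the summary output.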
# Draw the scatter plot with a least-squares line (geom_smooth fits bmi ~ weight, the reverse of model L)
c=ggplot(survey,aes(x=weight,y=bmi))
c + geom_point() + geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
# Construct the residual plot (residuals of model L against weight)
res=L$residuals
ggplot(survey,aes(weight,res))+geom_point()+geom_hline(yintercept = 0, colour="yellow")
Relevant studies report that BMI is closely related to weight (Hoor, Plasqui, Schols, & Kok, 2018; Peterson, Thomas, Blackburn, & Heymsfield, 2016). Sleeping more than 7 hours a day has also been associated with a lower BMI (Sung, 2017).
Peterson, C. M., Thomas, D. M., Blackburn, G. L., & Heymsfield, S. B. (2016). Universal equation for estimating ideal body weight and body weight at any BMI. The American Journal of Clinical Nutrition, 103(5), 1197–1203.
Sung, B. (2017). Analysis of the relationship between sleep duration and body mass index in a South Korean adult population: A propensity score matching approach. Journal of Lifestyle Medicine, 7(2), 76–83. doi: 10.15280/jlm.2017.7.2.76
Weight-height relationships and body mass index: Some observations from the diverse populations collaboration. (2005). American Journal of Physical Anthropology, 128(1), 220–229. doi: 10.1002/ajpa.20107